Effects of a cloud filtering method for Fengyun-3C Microwave Humidity and 
Temperature Sounder measurements over ocean on retrievals of temperature and 
humidity 
HE Qiurui, WANG Zhenzhan, HE Jieying

1. Key Laboratory of Microwave Remote Sensing, National Space Science Center, Chinese Academy of Sciences, Beijing 100190, 
China; 

2. University of Chinese Academy of Sciences, Beijing 100049, China 
Abstract: For Microwave Humidity and Temperature Sounder (MWHTS) measurements over ocean, a cloud
filtering method is presented that removes cloud- and precipitation-affected observations. The method is
built by analyzing the sensitivity of simulated MWHTS brightness temperatures to cloud liquid water and by
using the root mean square error (RMSE) between observation and simulation in clear sky as the reference
standard. Atmospheric temperature and humidity profiles are retrieved from MWHTS measurements with and
without filtering using multiple linear regression (MLR), artificial neural network (ANN) and
one-dimensional variational (1DVAR) retrieval methods, and the effects of the filtering method on the
retrieval accuracies are analyzed. The numerical results show that the filtering method improves the
retrieval accuracies of the MLR and 1DVAR retrieval methods but has little influence on that of the ANN. In
addition, the dependence of the retrieval methods on the testing samples of brightness temperature is
studied. The results show that the 1DVAR retrieval method is the most stable: the testing samples have a
great impact on the retrieval accuracies of the MLR and the ANN but little impact on that of the 1DVAR.
Index Terms: FY-3C/MWHTS, cloud filtering method, multiple linear regression, artificial neural networks,
one-dimensional variational retrieval
1. Introduction 


The observations of space-borne radiometers, which measure the thermal radiation coming from the Earth's
surface and atmosphere, are an important data source for retrieving atmospheric parameters (Moradi et al.).
In addition to absorption and emission by atmospheric constituents, the propagation of electromagnetic
waves in the atmosphere is influenced by scattering from clouds and precipitation. This scattering depends
on cloud cover, cloud liquid water content, cloud ice water content, raindrop size and other microphysical
parameters. It makes atmospheric radiative transfer more complex than in clear sky, and it increases both
the nonlinearity of the radiative transfer equation and the difficulty of retrieving atmospheric parameters
(Navas-Guzman et al.). Because clouds absorb and scatter microwave radiation less strongly than visible or
infrared radiation, microwave radiometers are well suited to continuous, all-day sounding (Perro et al.).
Nevertheless, a cloud filtering method that removes microwave observations affected by clouds and
precipitation is critical to ensuring retrieval accuracy. Although atmospheric temperature and humidity
profiling with satellite-borne sounders has a history of over 50 years (Polyakov et al.; Tan et al.), the
retrieval strategies can be classified into two categories: statistical methods and physical methods.
Statistical methods, which include multiple linear regression and artificial neural networks, do not
involve any physical model; they use the statistical relationship between the atmospheric parameters and
the observations to obtain the retrievals. When thick clouds and/or rain are in the sounder's field of view,
the statistical model may become more complicated and less accurate, which directly affects the retrieval
accuracy. For physical retrieval methods, modeling atmospheric radiative transfer accurately is the
priority. Because the scattering of electromagnetic waves in clouds and rain is complicated, simulations of
scattering are difficult and often inaccurate. This increases the discrepancy between simulations and
observations in the inversion of the radiative transfer equation and in turn degrades the retrieval
accuracy. In short, how cloud- and precipitation-affected observations from satellite-borne microwave
radiometers are handled is particularly important in the inversion.

Many previous studies have developed cloud filtering methods for microwave observations, in which thick
cloud and/or rain are referred to collectively as cloud. Karstens et al. took the values of relative
humidity profiles from meteorological observation data as the threshold for determining the clear sky case.
Ishimoto et al. and Li et al. filtered out cloud-affected brightness temperatures according to infrared
cloud images or cloud products. Both kinds of method depend on a third-party data source, which may
introduce additional errors into the retrieval process, and the cloud filtering criteria must vary with the
characteristics of the third-party data. Compared with these two kinds of method, exploiting the
characteristics of the satellite data itself has great advantages in dealing with cloud- and
precipitation-affected observations. The effects of clouds and rain on the measurements of satellite
microwave radiometers have been studied through simulations and observations (Muller et al.; Burns et al.;
Skofronick-Jackson et al.; Greenwald and Christopher; Bennartz and Bauer; Hong et al.; Buehler et al.). For
the Advanced Microwave Sounding Unit (AMSU)-B, which has two window channels (at 89 and 150 GHz) and three
water vapor channels (at 183 ± 1, 183 ± 3 and 183 ± 7 GHz), Burns et al. analyzed the correspondence
between thick clouds and brightness temperature depressions and suggested a criterion, based on the
differences between observed brightness temperatures at 183.31 ± 3 and 183.31 ± 1 GHz, to screen out
observations in severe convective weather. In their case study, the filtering method improved the agreement
between observations and simulations by up to a factor of two. Hong et al. developed a method to detect
tropical deep convective clouds and convective overshooting using the brightness temperature differences
between the AMSU-B water vapor channels. This method, which was validated by two other aircraft cases and
the radiative transfer model, also took the varying viewing angle of AMSU-B into account. Buehler et al.
developed a method that combined the threshold brightness temperatures from AMSU-B channel 18 suggested by
Greenwald with the threshold brightness temperature differences suggested by Burns to filter high and
heavily laden ice clouds from AMSU-B observations. The robustness of this cloud filtering method was
demonstrated by a mid-latitude winter case study, and the method was also applied to study different biases
in upper tropospheric humidity climatologies. These examples show that different cloud filtering methods,
tailored to the application, can be developed from the characteristics of satellite microwave measurements.
MWHTS not only has the same water vapor channels as AMSU-B but also adds two new water vapor channels at
183.31 ± 1.8 and 183.31 ± 4.5 GHz. However, there has been little research on cloud filtering methods for
MWHTS observations prior to the inversion of atmospheric parameters.

In our study, we analyze the sensitivity of the MWHTS water vapor channels to cloud liquid water, take the
RMSE of simulations with respect to observations in clear sky as the reference value, and set three cloud
filtering criteria based on the relationships among the brightness temperatures of the water vapor
channels. The criteria screen out the cloud- and precipitation-affected observations that cannot be
simulated accurately by the radiative transfer model RTTOV (Radiative Transfer for Television and Infrared
Observation Satellite Operational Vertical Sounder) (Hocking et al.). After determining the optimal cloud
filtering criterion, MWHTS observations over ocean from 1 to 28 February 2015, before and after filtering,
are used to retrieve the profiles of atmospheric temperature and water vapor by the MLR, ANN and 1DVAR
retrieval methods, and the effects of the filtering method on the retrieval accuracies are investigated.
In addition, MWHTS observations over ocean from 1 to 31 May 2015 are used to test the stability of these
three retrieval methods.
2. Data and model 


2.1. FY-3C/MWHTS 


MWHTS aboard the FY-3C satellite is a total power radiometer that observes the earth-atmosphere system in a
cross-track scanning manner. It has 15 channels: eight temperature channels, five water vapor channels and
two window channels. The eight temperature channels, centered on the 118.75 GHz oxygen absorption line, are
used operationally for the first time internationally to measure temperature from the surface to the upper
atmosphere. The five water vapor channels, centered on the 183.31 GHz water vapor absorption line, sound
humidity and precipitation in the troposphere. The two window channels, at 89.0 and 150.0 GHz, provide
surface information. The MWHTS swath width is 2645 km. One scan line, which includes 98 fields of view
(FOV), takes 2.667 s to complete, and the nominal FOV at nadir is 16 km (Guo et al.). Some MWHTS channel
characteristics, including channel center frequency, polarization, bandwidth, channel sensitivity and
channel sensitivity measured in flight, are listed in Table 1 (Bao). The weighting functions for the
temperature and humidity channels, calculated by RTTOV at nadir from the U.S. standard atmospheric profile
with the surface emissivity set to 0.5, are shown in Figure 1.


Table 1. MWHTS Channel Characteristics.

Channel  Frequency (GHz)  Polarization  Bandwidth (MHz)  Sensitivity (K)  In-flight sensitivity (K)
1        89.0             V             1500             1.0              0.23
2        118.75±0.08      H             20               3.6              1.63
3        118.75±0.2       H             100              2.0              0.74
4        118.75±0.3       H             165              1.6              0.59
5        118.75±0.8       H             200              1.6              0.65
6        118.75±1.1       H             200              1.6              0.52
7        118.75±2.5       H             200              1.6              0.49
8        118.75±3.0       H             1000             1.0              0.27
9        118.75±5.0       H             2000             1.0              0.27
10       150.0            V             1500             1.0              0.34
11       183.31±1.0       H             500              1.0              0.47
12       183.31±1.8       H             700              1.0              0.34
13       183.31±3.0       H             1000             1.0              0.30
14       183.31±4.5       H             2000             1.0              0.22
15       183.31±7.0       H             2000             1.0              0.27

Figure 1. Weighting functions for MWHTS. (a) Temperature channels, (b) Humidity channels.
2.2. Data and model 


The data used in our study include MWHTS L1b brightness temperature data products and European Centre for
Medium-Range Weather Forecasts (ECMWF) ERA Interim reanalysis data. The MWHTS brightness temperature data
products are provided by the National Satellite Meteorological Center (NSMC)
(http://www.nsmc.cma.gov.cn/NSMC/HOME/Index.html). The ERA Interim reanalysis data, with a horizontal
resolution of 1° × 1° and a temporal resolution of 6 h (i.e., available at 0000, 0600, 1200 and 1800 UTC),
are provided by ECMWF (Dee et al.). The surface parameters of the ERA Interim reanalysis used in our study
are skin temperature, 2 m air temperature, 2 m air dewpoint temperature, 10 m wind speed and surface
pressure. The profile parameters, which are temperature, humidity, cloud liquid water and cloud ice water,
are given on 37 pressure levels unevenly spaced from 1000 hPa to 1 hPa. MWHTS brightness temperatures and
ERA Interim reanalysis data over ocean in the geographic area (135° E to 165° E, 0° N to 30° N) are
selected to generate two datasets. The first is the statistical analysis dataset, which contains MWHTS
brightness temperatures collocated with ECMWF reanalysis for the period from 1 February 2014 to 31 January
2015. The second is the testing dataset, which is collocated in the same way except that the time period is
from 1 to 28 February 2015. The collocation criteria for brightness temperatures and reanalysis are that the
time difference between them is less than 10 min and that the absolute distance between their positions
(latitude and longitude) is less than 0.05°. Based on these criteria, the statistical analysis dataset
contains 490142 collocated samples and the testing dataset contains 37995 collocated samples. The fast
radiative transfer model RTTOV version 11.2, developed by ECMWF, is used to calculate the simulated
brightness temperatures and the gradients of brightness temperature of MWHTS.
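The collocation criteria above can be sketched as a simple predicate. This is a minimal illustration, not the authors' implementation: the function name and arguments are hypothetical, and it assumes that the 0.05° limit is applied to latitude and longitude separately, which the text does not state explicitly.

```python
from datetime import datetime, timedelta

MAX_TIME_DIFF = timedelta(minutes=10)  # time criterion from the text
MAX_POSITION_DIFF_DEG = 0.05           # position criterion from the text


def is_collocated(obs_time, obs_lat, obs_lon, ana_time, ana_lat, ana_lon):
    """Return True when an MWHTS observation matches a reanalysis grid point."""
    close_in_time = abs(obs_time - ana_time) < MAX_TIME_DIFF
    # Assumption: latitude and longitude differences are tested separately.
    close_in_space = (abs(obs_lat - ana_lat) < MAX_POSITION_DIFF_DEG
                      and abs(obs_lon - ana_lon) < MAX_POSITION_DIFF_DEG)
    return close_in_time and close_in_space
```

Because the reanalysis is available only every 6 h on a 1° grid, criteria this tight keep only observations that fall almost exactly on a grid point near an analysis time.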
3. Cloud filtering method 


For the statistical analysis dataset of 490142 collocated samples described in section 2.2, we use the
absorb-based model and the scatter-based model of RTTOV to simulate MWHTS brightness temperatures, and we
evaluate the simulation accuracies by their RMSEs with respect to the observations. To compare the
simulation accuracy of RTTOV in clear sky with that in all weather conditions, we select from the
statistical analysis dataset, which covers all weather conditions, the 2485 collocated samples whose cloud
liquid water content is zero and treat them as clear sky. We simulate MWHTS observations in clear sky and in
all weather conditions and obtain the simulation accuracies RMSE_clear and RMSE_all, respectively, which are
shown in Figure 2.
Figure 2. RMSEs of simulated measurements with respect to observations. (a) RMSE_clear, (b) RMSE_all.


In clear sky (Figure 2(a)), the simulation accuracies of the two models are equal. The RMSEs in the two
window channels are large, the RMSEs in the temperature channels are within 2 K except for channels 2, 4
and 9, and the RMSEs in the humidity channels are within 3 K. In all weather conditions, Figure 1(a) shows
that channels 2, 3 and 4 are sensitive to the upper atmosphere above 100 hPa, where brightness temperatures
are little affected by clouds and rain, so for these channels the simulation accuracies of the absorb-based
model are nearly the same as in clear sky. For the scatter-based model, however, the simulation accuracies
in these channels are very poor, possibly because atmospheric parameters such as the cloud cover and cloud
ice water content profiles are inaccurate in the upper atmosphere. The RMSEs in the other channels are
higher than those in clear sky. Except for channels 7, 8, 10 and 15, where the simulation accuracies of the
scatter-based model are slightly higher than those of the absorb-based model, the absorb-based model is
clearly superior to the scatter-based model. There may be two reasons. First, the ERA Interim reanalysis
lacks the rain and snow parameters, which make the main contributions to scattering and are set to zero in
the RTTOV simulations. Second, the scatter-based model needs more atmospheric parameters than the
absorb-based model, and inaccuracy in these parameters, especially in the cloud cover profile, can affect
the simulation accuracies of RTTOV (Geer et al.). Based on this analysis, we choose the absorb-based model
of RTTOV to simulate MWHTS observations in our study. However, to obtain higher simulation accuracy, the
cloud- and precipitation-affected observations must be removed.
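The per-channel comparison of RMSE_clear and RMSE_all described above could be computed along the following lines. This is a sketch under stated assumptions: the array layout (samples by channels) and the function names are hypothetical, and the simulated brightness temperatures are assumed to have been produced by RTTOV beforehand.

```python
import numpy as np


def channel_rmse(simulated, observed):
    """Per-channel RMSE; both inputs have shape (n_samples, n_channels)."""
    diff = np.asarray(simulated, dtype=float) - np.asarray(observed, dtype=float)
    return np.sqrt(np.mean(diff ** 2, axis=0))


def clear_sky_mask(cloud_liquid_water):
    """Clear-sky samples are those whose cloud liquid water content is zero."""
    return np.asarray(cloud_liquid_water, dtype=float) == 0.0
```

With a mask `m = clear_sky_mask(clw)`, `channel_rmse(sim[m], obs[m])` would play the role of RMSE_clear and `channel_rmse(sim, obs)` that of RMSE_all.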

In clear sky, for the MWHTS water vapor channels, the farther a channel's frequency is from the center of
the 183.31 GHz absorption line, the larger its brightness temperature. This is because the opacity caused by
water vapor is smaller for channels farther from the line center, so they sound a warmer and lower part of
the atmosphere. When clouds and/or rain are present, the brightness temperatures of channels whose
weighting function (WF) peaks lie above the clouds and/or rain are almost unaffected. For channels whose WF
peaks lie near or below the clouds and/or rain, however, the opacity increases markedly as the cloud water
content or rain water content increases. As a result, the weighting functions shift upward and the channels
receive more of their signal from a higher and colder part of the atmosphere. Most of the upwelling
radiation from below the cloud or rain layer is absorbed or scattered away from the direction of the
satellite, which lowers the brightness temperatures relative to clear sky. In other words, clouds and rain
cause brightness temperature depressions (Perro et al.). Because the water vapor channels differ in their
sensitivity to clouds and rain, the brightness temperature differences Δ between the water vapor channels
can be used as an indicator of the presence of clouds and rain (Hong et al.). We calculate the simulated
brightness temperatures at nadir by RTTOV from the temperature and humidity parameters of the U.S. standard
atmospheric profile. According to the statistics of the cloud liquid water content and cloud ice water
content in the ERA Interim reanalysis for the period from 1 February 2014 to 31 January 2015, the cloud
liquid water content and cloud ice water content vary over 0-4 mm and 0-2 mm, respectively. Because water
cloud is mainly distributed below 400 hPa in the middle and lower atmosphere, the cloud liquid water content
in the simulations is distributed evenly below 350 hPa. Because ice cloud is mainly distributed above 600
hPa, the cloud ice water content is distributed evenly between 100 hPa and 550 hPa. For the same content of
clouds and/or rain in different FOVs at nadir, different cloud cover can cause different brightness
temperatures (Geer et al.). To avoid the impact of cloud cover on the simulations, the cloud cover is set to
1.0 whenever clouds or rain are present. MWHTS channel 11, which is sensitive to the level at about 350 hPa,
is scarcely influenced by clouds and rain, so it is chosen as the reference channel. Δ_{i-11} (i = 12, 13,
14, 15) denotes the brightness temperature difference between channel i and channel 11. Because cloud ice
water content has little effect on MWHTS brightness temperatures (Guo et al.), only the effects of the cloud
liquid water content on the brightness temperature differences Δ_{i-11} are investigated here. Figure 3
shows the sensitivities of the brightness temperature differences Δ_{i-11} to the cloud liquid water
content.
Figure 3. The sensitivities of brightness temperature differences of water vapor channels to cloud liquid water content.


Figure 3 shows that the brightness temperature differences Δ_{i-11} decrease as the cloud liquid water
content increases and that the rate of decrease gradually slows. The increase in cloud liquid water content
shifts the WF peaks of the water vapor channels upward and produces brightness temperature depressions; the
farther the frequency is from the center of the 183.31 GHz line, the larger the depression (Perro et al.).
In general, however, it rains near the surface when the cloud liquid water content exceeds 0.5 mm (Karstens
et al.), and the MWHTS water vapor channels are little affected by rain because their WF peaks lie above
900 hPa. This is why the rate of decrease slows. As Figure 3 shows, the brightness temperature differences
Δ_{i-11} differ in their sensitivity to cloud liquid water content, depending on the weighting function
distributions of the water vapor channels. Δ_{15-11} and Δ_{14-11} are used to screen out the cloud- and
precipitation-affected observations in our study because they are more sensitive than Δ_{12-11} and
Δ_{13-11}. For Δ_{15-11}, the value 11.7 K, which corresponds to a cloud liquid water content of 0.5 mm, is
first taken as the reference threshold, and the RMSE_clear of the absorb-based model in clear sky (Figure 2)
is taken as the reference values. We then adjust the threshold in steps of 0.2 K and stop when the
differences between the RMSE_all of the absorb-based model in all weather conditions (Figure 2) and the
reference values RMSE_clear are within 1 K for the temperature channels and within 1.5 K for the water vapor
channels. This procedure gives a threshold of 12.5 K. For Δ_{14-11}, the threshold is determined in the same
way, starting from a reference threshold of 7.1 K, and the resulting threshold is 8.1 K. From these two
thresholds we develop three criteria: the first is Δ_{15-11} > 12.5 K, the second is Δ_{14-11} > 8.1 K, and
the third is Δ_{15-11} > 12.5 K and Δ_{14-11} > 8.1 K. Applying these three criteria to the statistical
analysis dataset to screen out the cloud- and precipitation-affected observations, we obtain three further
statistical analysis datasets that are not affected by clouds and rain: statistical analysis datasets 1, 2
and 3, which contain 467587, 470051 and 452683 collocated samples, respectively. A statistical analysis is
performed on the RMSEs of the simulations with respect to the observed brightness temperatures after cloud
filtering. Figure 4 shows the simulation accuracies of RTTOV for the observations filtered by these three
criteria.
Figure 4. RMSE of simulated measurements with respect to observations filtered by three criteria.


Figure 4 shows that the simulation accuracies for every channel except channels 2, 3 and 4 are clearly
improved compared with the RMSE_all in Figure 2(b). MWHTS channels 15 and 14 are sensitive to the levels at
about 900 to 800 hPa and 800 to 700 hPa, respectively. Clouds in these layers are detected well by
criterion 1 (Δ_{15-11} > 12.5 K) and criterion 2 (Δ_{14-11} > 8.1 K). However, clouds at about 900 to 800
hPa have little effect on channel 14, because its water vapor absorption is mainly located above the cloud
top. Clouds at about 800 to 700 hPa decrease the brightness temperature of channel 15 through increased
absorption and scattering by the ice and water droplets in the clouds (Burns et al.). In other words,
criterion 1 screens out more cloud- and precipitation-affected observations than criterion 2. Note also
that criterion 2 is more sensitive than criterion 1 to higher-level clouds, especially at about 800 to 700
hPa. Consequently, for the water vapor channels, the simulation accuracy of channel 15 is higher with
criterion 1 than with criterion 2, whereas criterion 2 gives better simulation results in channels 13 and
14. For temperature channels 6, 7, 8 and 9 and window channels 1 and 10, criterion 1 improves the simulation
accuracies relative to criterion 2 to varying degrees; the largest improvement, in channel 9, is 0.4 K. The
reason is that criterion 1 better identifies the low-level clouds near the surface, where ice and water
droplets introduce scattering into the brightness temperatures of the channels whose WF peaks are closer to
the surface. Criterion 3 combines criteria 1 and 2, so it detects clouds between about 900 and 700 hPa,
screens out more cloud- and precipitation-affected observations, and achieves higher simulation accuracies
than the first two criteria, as shown in Figure 4. On the basis of this analysis, criterion 3 is adopted as
the cloud filtering method in this paper. Compared with AMSU-B, MWHTS provides greater sensitivity to clouds
in different layers of the troposphere because of the new water vapor channel at 183.31 ± 4.5 GHz. On this
basis, cloud filtering criterion 3 performs better in filtering out cloud- and precipitation-affected
observations and effectively improves the simulation accuracy of RTTOV. To further evaluate the performance
of the cloud filtering method in the inversion of atmospheric parameters and to provide reference values for
different retrieval techniques, retrieval experiments are carried out.
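The three criteria and the threshold adjustment described in this section can be sketched as follows. The function names and array layout are hypothetical. Two readings of the text are assumed: the difference is taken as channel i minus channel 11, and an observation is retained when its difference exceeds the threshold (since the differences decrease as cloud liquid water increases). The direction of the adjustment (upward from the reference threshold) is inferred from the reported values, and `rmse_for_threshold` stands in for a full RTTOV comparison on the filtered samples.

```python
import numpy as np


def retained_masks(tb, thresh_15_11=12.5, thresh_14_11=8.1):
    """Masks of observations retained by criteria 1, 2 and 3.

    tb has shape (n_samples, 15); channel k is stored in column k - 1.
    """
    tb = np.asarray(tb, dtype=float)
    delta_15_11 = tb[:, 14] - tb[:, 10]
    delta_14_11 = tb[:, 13] - tb[:, 10]
    keep_1 = delta_15_11 > thresh_15_11
    keep_2 = delta_14_11 > thresh_14_11
    return keep_1, keep_2, keep_1 & keep_2


def tune_threshold(start, rmse_for_threshold, rmse_clear, tolerance,
                   step=0.2, max_steps=100):
    """Raise the threshold in fixed steps until the RMSE is close to clear sky.

    rmse_for_threshold(t) is a hypothetical callable returning per-channel
    RMSEs of the observations that pass threshold t; tolerance holds the
    per-channel limits (1 K or 1.5 K in the text).
    """
    rmse_clear = np.asarray(rmse_clear, dtype=float)
    for n in range(max_steps + 1):
        threshold = round(start + n * step, 6)
        excess = np.asarray(rmse_for_threshold(threshold), dtype=float) - rmse_clear
        if np.all(excess <= tolerance):
            return threshold
    raise ValueError("no threshold met the tolerance")
```

Starting from 11.7 K with a 0.2 K step, four steps reach the 12.5 K reported for Δ_{15-11}; starting from 7.1 K, five steps reach the 8.1 K reported for Δ_{14-11}.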
4. Retrieval methods 


4.1. MLR retrieval algorithm 


In essence, the MLR retrieval algorithm converts the observations to atmospheric parameters through a
linear regression model that represents the linear relationship between the radiometer observations and the
atmospheric parameters, including temperature, humidity and cloud water parameters. The model is given by
(Chen and Jin):

    x_i - \langle x_i \rangle = \sum_{j=1}^{M} D_{ij} \left( T_j - \langle T_j \rangle \right),    (1)

where x is the atmospheric state vector, i is the pressure level index, \langle \cdot \rangle denotes the
statistical mean, T is the observed radiance vector, j is the radiometer channel index, M is the number of
channels, and D is the retrieval operator. In our study, the collocated samples in the statistical analysis
dataset, comprising atmospheric temperature profiles, humidity profiles and MWHTS brightness temperatures,
are used to calculate the retrieval operator D. We then apply D to the MWHTS brightness temperatures in the
testing dataset to obtain the retrievals.
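Equation (1) can be illustrated with a short sketch. The text does not say how the retrieval operator D is estimated, so ordinary least squares is assumed here; the function names and array shapes are hypothetical.

```python
import numpy as np


def fit_mlr(states, tbs):
    """Estimate D of equation (1) by least squares (an assumed estimator).

    states: (n_samples, n_levels) atmospheric parameters.
    tbs:    (n_samples, n_channels) brightness temperatures.
    Returns the mean state, the mean brightness temperatures and D
    with shape (n_levels, n_channels).
    """
    x_mean = states.mean(axis=0)
    t_mean = tbs.mean(axis=0)
    # Solve (tbs - t_mean) @ D.T ~= (states - x_mean) in the least-squares sense.
    d_transposed, *_ = np.linalg.lstsq(tbs - t_mean, states - x_mean, rcond=None)
    return x_mean, t_mean, d_transposed.T


def retrieve_mlr(tbs, x_mean, t_mean, d):
    """Apply equation (1): x = <x> + D (T - <T>) for each sample."""
    return x_mean + (tbs - t_mean) @ d.T
```

In the setting of the paper, `fit_mlr` would be run on a statistical analysis dataset and `retrieve_mlr` on the brightness temperatures of a testing dataset.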
4.2. ANN retrieval algorithm 


An ANN is by nature a statistical regression model. Unlike the MLR, an ANN not only describes the linear
relationship between input and output data but can also, in theory, represent any nonlinear relationship.
Following the vast majority of publications on applying ANNs to the inversion of atmospheric parameters, a
three-layer back-propagation (BP) ANN with one hidden layer is chosen for the retrieval considered here.
Figure 5 shows a schematic diagram of the chosen network. Additional details on initialization, training,
optimization and other advanced topics can be found in Polyakov et al. and Yao et al.
Figure 5. Diagram of BP ANN.


In our study, the MWHTS brightness temperatures in the statistical analysis dataset are taken as the input
vector X_L, where L = 1~15 is the index of the MWHTS channels. The profile parameters (temperature and
humidity) and the surface parameters (2 m air temperature, 2 m air dewpoint temperature and 10 m wind
speed) in the statistical analysis dataset are taken as the output vector Z_N. N = 1~37 and N = 38~74 index
the pressure levels from the upper atmosphere to the surface for temperature and humidity, respectively,
and N = 75, 76, 77, 78 represent the 2 m air temperature, 2 m air dewpoint temperature, 10 m U wind
component and 10 m V wind component. The input vector X_L and the output vector Z_N make up the training
samples, and the steepest descent method is used in the training phase. Based on many tests, a hidden layer
with 16 nodes was found to perform best in our study. The weights and biases are determined by training on
90% of the training samples; the other 10% are used for validation to determine when to stop training.
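A network of this shape can be sketched in a few lines. This is only an illustration of the stated ingredients (one hidden layer of 16 nodes, steepest descent, a 90%/10% training/validation split used to decide when to stop). The tanh hidden activation, linear output layer, learning rate, weight initialization and patience rule are all assumptions, since the text does not specify them.

```python
import numpy as np


def predict(params, x):
    """Forward pass: tanh hidden layer followed by a linear output layer."""
    w1, b1, w2, b2 = params
    return np.tanh(x @ w1 + b1) @ w2 + b2


def train_bp_ann(x, z, n_hidden=16, lr=0.05, max_epochs=2000, patience=50, seed=0):
    """Train by full-batch steepest descent with validation-based stopping.

    x: inputs (n_samples, n_in); z: targets (n_samples, n_out).
    Returns the weights that gave the lowest validation error.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_train = int(0.9 * len(x))                 # 90% training, 10% validation
    tr, va = order[:n_train], order[n_train:]
    w1 = rng.normal(0.0, 0.1, (x.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.1, (n_hidden, z.shape[1]))
    b2 = np.zeros(z.shape[1])
    best_val, best_params, stale = np.inf, None, 0
    for _ in range(max_epochs):
        h = np.tanh(x[tr] @ w1 + b1)            # hidden activations
        err = h @ w2 + b2 - z[tr]               # output error
        grad_h = (err @ w2.T) * (1.0 - h ** 2)  # error back-propagated to hidden layer
        w2 -= lr * (h.T @ err) / n_train
        b2 -= lr * err.mean(axis=0)
        w1 -= lr * (x[tr].T @ grad_h) / n_train
        b1 -= lr * grad_h.mean(axis=0)
        val = np.mean((predict((w1, b1, w2, b2), x[va]) - z[va]) ** 2)
        if val < best_val:
            best_val, stale = val, 0
            best_params = (w1.copy(), b1.copy(), w2.copy(), b2.copy())
        else:
            stale += 1
            if stale >= patience:               # validation error stopped improving
                break
    return best_params
```

For the configuration in the text, `x` would have 15 columns (MWHTS channels) and `z` 78 columns (the output vector Z_N); in practice both would normally be scaled before training.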
4.3. 1DVAR retrieval algorithm


The 1DVAR method is a typical physical retrieval approach, which obtains the atmospheric parameters by
directly solving the radiative transfer equation. The algorithm consists of two main parts: the radiative
transfer model, which calculates the simulated brightness temperatures and the gradients of brightness
temperature, and a scheme for minimizing the cost function, which weights the relative contributions of the
satellite observations and the background information. If the errors in the background information and the
satellite observations are unbiased, uncorrelated and Gaussian distributed, the atmospheric state vector x
can be obtained by minimizing the following cost function (Liu and Weng):

    J = \frac{1}{2} (x - x^b)^T B^{-1} (x - x^b) + \frac{1}{2} [H(x) - I]^T R^{-1} [H(x) - I],    (2)

where B is the background covariance matrix, x^b is the background state vector, R is the sum of the error
covariance of the simulated brightness temperatures and the instrument channel noise, the superscript T
denotes the matrix transpose, H is the forward operator that simulates the satellite observations for the
atmospheric state vector x, and I is the observed brightness temperature vector. Setting the gradient of the
cost function to zero, the optimal estimate of x is obtained iteratively by:

    x_{n+1} = x^b + B H'^T(x_n) [ H'(x_n) B H'^T(x_n) + R ]^{-1} [ I - H(x_n) - H'(x_n) (x^b - x_n) ],    (3)

where n is the iteration index, the starting point x_0 is the first guess profile, and H' is the tangent
linear function of H at point x. Equation (3) shows that the parameters of the retrieval algorithm, namely
the background covariance matrix B, the background state vector x^b, the error covariance matrix R, the
first guess profile x_0 and the system bias H(x) - I, must be determined before retrieving.
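Equations (2) and (3) can be written compactly with NumPy. The sketch below is illustrative only: `forward` and `tangent_linear` are hypothetical callables standing in for RTTOV and its Jacobian, and linear solves are used in place of explicit matrix inverses.

```python
import numpy as np


def cost_function(x, xb, b, r, obs, forward):
    """Equation (2): background term plus observation term."""
    dx = x - xb
    dy = forward(x) - obs
    return 0.5 * dx @ np.linalg.solve(b, dx) + 0.5 * dy @ np.linalg.solve(r, dy)


def update_state(x_n, xb, b, r, obs, forward, tangent_linear):
    """Equation (3): one iteration from x_n to x_{n+1}."""
    k = tangent_linear(x_n)                      # H'(x_n), shape (n_channels, n_state)
    innovation = obs - forward(x_n) - k @ (xb - x_n)
    s = k @ b @ k.T + r                          # H' B H'^T + R
    return xb + b @ k.T @ np.linalg.solve(s, innovation)
```

For a linear forward operator a single call to `update_state` already reaches the minimum of the cost function, whatever the starting point; for a nonlinear operator such as a radiative transfer model the update is repeated.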


The a priori information, comprising the background covariance matrix B and the background state vector
x^b, directly affects the retrieval accuracy of the physical retrieval process; it constrains the iteration
in equation (3) to physically realistic solutions. In our study, the atmospheric temperature and humidity
profiles from the ERA Interim reanalysis for the period from 1 February 2014 to 31 January 2015 are used to
compute the background covariance matrix B (see Boukabara et al. for the detailed computing method). The
means of the atmospheric parameters used to compute B are taken as the background state vector x^b. The
outputs of the BP ANN retrieval method in section 4.2 are taken as the first guess profiles x_0.


For the biases between simulations and observations, the statistical regression correction is given by (Li
et al.):

    I'_{i,j} = a_{i,j} I_{i,j} + b_{i,j},    (4)

where I' is the corrected MWHTS brightness temperature, I is the MWHTS brightness temperature without
correction, a is the slope, b is the intercept, i = 1~15 is the index of the MWHTS channels, and j = 1~98 is
the index of the MWHTS scan positions. We determine the correction coefficients a and b through statistical
analysis of the simulations and the observations from the statistical analysis dataset and then correct the
observed brightness temperatures from the testing dataset.
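The correction in equation (4) amounts to a straight-line fit for each channel and scan position. The sketch below assumes that the simulated brightness temperatures are regressed on the observed ones by ordinary least squares; the text says only that the coefficients come from a statistical analysis, and the function names are hypothetical.

```python
import numpy as np


def fit_bias_correction(observed, simulated):
    """Fit slope a and intercept b for one channel and one scan position.

    observed, simulated: 1-D arrays of brightness temperatures (K).
    Assumption: the simulations are regressed on the observations.
    """
    a, b = np.polyfit(observed, simulated, 1)
    return a, b


def apply_bias_correction(observed, a, b):
    """Equation (4): corrected brightness temperature."""
    return a * np.asarray(observed, dtype=float) + b
```

In the setting of the paper the fit would be repeated for all 15 channels and 98 scan positions on the statistical analysis dataset, and the resulting coefficients applied to the testing dataset.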

For the error covariance matrix R, the measurements in one channel are assumed to be uncorrelated with those
in the others, so only the diagonal elements of R are used. They are determined from the differences between
the simulations and the observations in the statistical analysis dataset and from the sensitivities measured
in flight listed in Table 1.

The convergence criterion adopted in our retrieval system is that the iteration in equation (3) stops when
the relative difference of the cost function between two successive iterations is less than 0.01. The
maximum number of iterations is set to 10; if the iteration count reaches 10, the retrieval is set to the
first guess. The retrieval quality control criterion is that the observations are discarded if, in any
channel, the residual between the corrected brightness temperature and the value simulated by RTTOV from the
first guess is greater than 20 K.
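The convergence and quality control rules can be sketched as a small driver around any update step. The names are hypothetical, and `step` and `cost` stand in for equation (3) and equation (2). One reading is assumed: a retrieval that meets the convergence test within the 10 allowed iterations is accepted, and otherwise the first guess is returned.

```python
import numpy as np


def iterate_until_converged(first_guess, step, cost, rel_tol=0.01, max_iter=10):
    """Iterate until the relative change of the cost function is below rel_tol.

    Returns (state, converged). If convergence is not reached within
    max_iter iterations, the first guess is returned.
    """
    x0 = np.asarray(first_guess, dtype=float)
    x = x0
    j_prev = cost(x)
    for _ in range(max_iter):
        x = step(x)
        j_new = cost(x)
        if abs(j_new - j_prev) < rel_tol * max(abs(j_prev), 1e-12):
            return x, True
        j_prev = j_new
    return x0, False


def passes_quality_control(corrected_tb, first_guess_tb, limit=20.0):
    """Reject the observation if any channel residual exceeds the limit (K)."""
    residual = np.abs(np.asarray(corrected_tb, dtype=float)
                      - np.asarray(first_guess_tb, dtype=float))
    return bool(np.all(residual <= limit))
```

Falling back to the first guess means that a non-converged 1DVAR retrieval is never worse than the ANN output used to initialize it.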
5. Validation and analysis of the retrieval results 


The WF distributions of the MWHTS channels shown in Figure 1 indicate that the temperature channels are not sensitive to the top of the atmosphere and that the water vapor channels are not sensitive to the levels above 300 hPa. We therefore validate the retrievals at levels from 1000 to 50 hPa for temperature and from 1000 to 250 hPa for relative humidity. The MWHTS profile retrievals are evaluated by the mean error (ME) and the RMSE with respect to the ECMWF ERA-Interim reanalysis, which is taken as the truth. The ME and RMSE are defined as follows (Mathur et al.):
$\mathrm{ME} = \frac{1}{N} \sum_{i=1}^{N} (X_{MWHTS} - X_{ECMWF})$    (5)

$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (X_{MWHTS} - X_{ECMWF})^{2}}$    (6)


where $X_{MWHTS}$ is the MWHTS retrieval, $X_{ECMWF}$ is the ECMWF reanalysis, and N is the total number of comparisons. Applying the three cloud filtering criteria proposed in our study to the testing dataset in section 2.2 to screen out the cloud- and precipitation-affected observations, we obtain three further testing datasets that are not affected by clouds and rain: testing dataset 1, testing dataset 2 and testing dataset 3, which include 35211, 35517 and 34021 collocated samples, respectively. RTTOV simulations are performed using the atmospheric parameters from these four testing datasets, and the simulation accuracies are shown in Figure 6. The simulation accuracy of each channel improves clearly after the cloud- and precipitation-affected observations in the testing dataset are filtered out, and criterion 3 gives higher simulation accuracies than criteria 1 and 2. These conclusions agree with those drawn from the statistical analysis datasets in section 3. To further evaluate the performance of the cloud filtering method in the inversion of atmospheric parameters, the observed brightness temperatures in these four testing datasets are used to retrieve the atmospheric temperature and humidity profiles.
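
For reference, the ME and RMSE of equations (5) and (6) can be computed per pressure level as in the short Python sketch below. The array layout (samples along the first axis, pressure levels along the second) and the function names are assumptions made for illustration.

```python
import numpy as np


def mean_error(retrieved, reference):
    """ME of equation (5), averaged over samples (axis 0) for each level."""
    diff = np.asarray(retrieved, dtype=float) - np.asarray(reference, dtype=float)
    return diff.mean(axis=0)


def rmse(retrieved, reference):
    """RMSE of equation (6), averaged over samples (axis 0) for each level."""
    diff = np.asarray(retrieved, dtype=float) - np.asarray(reference, dtype=float)
    return np.sqrt((diff ** 2).mean(axis=0))
```

Here retrieved would hold the MWHTS retrievals and reference the collocated ECMWF reanalysis profiles, both with shape (N, number of levels).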
Figure 6. Simulation accuracies by RTTOV in the testing dataset before and after filtering.
5.1. The retrieval results and analysis of MLR method 


The statistical analysis dataset in section 2.2 and the three statistical analysis datasets in section 3 are used separately to calculate the retrieval operator D, and the retrieval results are shown in Figure 7. For temperature, the retrieval MEs obtained from the testing-dataset brightness temperatures before and after filtering are nearly equal. They reach their maximum, about 2.8 K, at 70 hPa and 850 hPa, and their minimum, about 0 K, between 200 hPa and 300 hPa; at the other levels the MEs are within 1.5 K. However, the retrieval RMSEs with filtering are smaller than those without filtering between 400 hPa and 1000 hPa, and criterion 3, which filters out cloud- and precipitation-affected observations better than the other criteria, gives the highest retrieval accuracy. Between 50 hPa and 400 hPa the RMSEs with and without filtering are almost the same, because the temperature channels that are sensitive to the upper atmosphere are hardly affected by clouds and rain. The retrieval RMSEs reach their minimum, 1.08 K, near 200 hPa and their maximum, about 3.4 K, near 850 hPa. For humidity, the retrieval MEs before and after filtering are nearly equal, as in the temperature retrievals, and are about 9% between 500 hPa and 600 hPa. At the other levels, the MEs are within 5%. The retrieval accuracies with filtering are higher than those without filtering between 300 hPa and 800 hPa, because the peak WF heights of the water vapor sounding channels are distributed mainly in this pressure range. As for temperature, criterion 3 gives the highest humidity retrieval accuracies, and the RMSEs reach their maximum, 19.5%, near 800 hPa.

Among the cloud filtering criteria developed in this paper, criterion 3 yields the largest improvement in the retrieval accuracy of multiple linear regression, which uses a linear relationship between atmospheric parameters and satellite observations, although criteria 1 and 2 also improve the retrieval accuracy to some extent. The filtering methods screen out the cloud- and precipitation-affected brightness temperatures, which reduces the nonlinearity caused by clouds and rain and brings the data closer to the linear model used by the retrieval approach. However, the relationships between atmospheric parameters and satellite observations remain nonlinear, so the retrieval accuracy of multiple linear regression is low.
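
The linear retrieval discussed above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: the retrieval operator D is estimated by ordinary least squares with numpy.linalg.lstsq and includes an intercept term, and the function names and array shapes are hypothetical. The exact form of D used in this study is defined earlier in the paper and may differ.

```python
import numpy as np


def fit_retrieval_operator(tb_train, x_train):
    """Estimate a linear retrieval operator D by least squares.

    tb_train: (n_samples, n_channels) brightness temperatures.
    x_train:  (n_samples, n_levels) collocated profiles (e.g. temperature).
    Returns D with shape (n_channels + 1, n_levels); the last row is the
    intercept (an assumption of this sketch).
    """
    tb_train = np.asarray(tb_train, dtype=float)
    x_train = np.asarray(x_train, dtype=float)
    design = np.hstack([tb_train, np.ones((tb_train.shape[0], 1))])
    d, *_ = np.linalg.lstsq(design, x_train, rcond=None)
    return d


def retrieve(tb, d):
    """Apply the operator D to new brightness temperatures."""
    tb = np.asarray(tb, dtype=float)
    design = np.hstack([tb, np.ones((tb.shape[0], 1))])
    return design @ d
```

Training one operator per statistical analysis dataset (unfiltered and filtered with each criterion) and applying it to the matching testing dataset mirrors the comparison reported in Figure 7.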
Figure 7. Retrieval results of MLR. (a) Temperature, (b) Humidity.
5.2. The retrieval results and analysis of ANN method 


The statistical analysis dataset in section 2.2 and the three statistical analysis datasets in section 3 are used separately to train the ANN and determine the weights and biases. The retrieval results are shown in Figure 8. For temperature, the retrieval MEs before and after filtering with the three criteria are nearly equal, and all are within 1 K. The retrieval RMSEs with filtering are smaller than those without filtering between 300 hPa and 200 hPa, larger between 900 hPa and 800 hPa, and nearly equal at the other levels. The RMSEs reach their maximum of about 2.5 K at 100 hPa and are within 1.8 K at the other levels. For humidity, the retrieval MEs before and after filtering with the three criteria are all within about 2%, and the RMSEs are nearly equal at all pressure levels, reaching their maximum of about 15% at 800 hPa.

The analysis of the ANN retrieval results shows that the cloud filtering method with the three criteria does not improve the retrieval accuracy, and even degrades it at some levels. This is because the ANN, with its strong nonlinear mapping ability, can describe the nonlinear relationships between the satellite observations and the atmospheric parameters caused by clouds and rain. In addition, the numbers of training samples in the statistical analysis datasets with and without filtering are different, and the performance of the ANN is affected by the number of training samples, which may lead to the worse retrieval accuracies.
Figure 8. Retrieval results of ANN. (a) Temperature, (b) Humidity.
5.3. The retrieval results and analysis of 1DVAR method


The statistical analysis dataset in section 2.2 and the three statistical analysis datasets in section 3 are used separately to determine the correction coefficients for the biases between observations and simulations, and the observed brightness temperatures in the four testing datasets are then corrected and used for retrieval. The retrieval results are shown in Figure 9. For temperature, the retrieval MEs before and after filtering are nearly equal above 400 hPa in the upper atmosphere, but at the other levels the retrieval MEs with filtering are clearly smaller than those without filtering, and all of them are within 0.4 K. The retrieval MEs with the three criteria are nearly equal. The retrieval accuracies with filtering are higher than those without filtering between 400 hPa and 1000 hPa, where temperature channels 7, 8 and 9 are sensitive, and criterion 3 gives the highest accuracy, 1.7 K, between 150 hPa and 1000 hPa. For humidity, the retrieval MEs and RMSEs with filtering are significantly smaller than those without filtering, especially between 200 hPa and 800 hPa, where the water vapor channels are sensitive, and the MEs are close to zero. Criterion 3 improves the humidity retrieval accuracies by up to 11.2% and gives the highest retrieval accuracies, about 19%.


From the viewpoint of physics, the 1DVAR method retrieves the atmospheric parameters by solving the radiative transfer equation directly. As equation (3) shows, in addition to the a priori information, the differences between the observations and the simulations also directly affect the retrieval accuracies. Filtering out cloud- and precipitation-affected observations therefore improves the retrieval accuracies of the atmospheric temperature and humidity profiles. Compared with criteria 1 and 2, criterion 3 screens out more cloud- and precipitation-affected observations and improves the retrieval accuracies more.
Figure 9. Results of 1DVAR. (a) Temperature, (b) Humidity.
5.4. The comparison results of three retrieval methods


According to the above analysis, for the temperature retrievals the MEs and RMSEs of the MLR method are the largest, and the retrieval accuracies of 1DVAR are slightly better than those of the ANN method. For the humidity retrievals, the MEs of both the ANN and 1DVAR methods are nearly zero at all levels. The retrieval accuracies of the ANN are the best, and the other two methods have similar retrieval accuracies. However, the retrieval accuracies of the 1DVAR method depend on many factors, as seen from equation (3). There are therefore several ways to further improve the retrieval accuracy, such as optimizing the bias correction method, improving the first guess profile, and applying further quality control to the satellite observations.

From the viewpoint of algorithmic difficulty, the statistical inversion methods, including multiple linear regression and ANN, are simple and easy to implement. In general, a good statistical model gives high retrieval accuracy, as the ANN trained in our study does. However, the stability of the retrieval algorithm is very important in operational retrieval of atmospheric parameters. In our study, we change the testing samples to study the stability of the three retrieval methods. MWHTS brightness temperatures covering the period from 1 to 31 May 2015 are selected, and the collocation criteria with the ERA-Interim reanalysis are the same as those in section 2. We obtain 43137 collocated samples under all weather conditions and 38004 collocated samples after filtering by criterion 3. The 38004 brightness temperature samples are used to retrieve the atmospheric temperature and humidity profiles using the three retrieval methods. The retrieval results are shown in Figure 10. For the MLR method, compared with the retrieval results using criterion 3 in Figure 7, different retrieval results are obtained because different brightness temperatures are used in the retrieval. The temperature retrieval MEs and RMSEs are both significantly smaller than those in Figure 7 between 400 hPa and 1000 hPa, and the temperature retrieval accuracies are improved by about 1.0 K. The humidity retrieval accuracies decrease by 5% near 500 hPa. For the ANN method, compared with the retrieval results using criterion 3 in Figure 8, the temperature retrieval MEs and RMSEs both increase significantly, and the temperature retrieval accuracies decline by about 0.4 K between 400 hPa and 800 hPa. The humidity retrieval accuracies decline by 5% near 800 hPa. However, for the 1DVAR method, compared with the retrieval results using criterion 3 in Figure 9, the retrieval accuracies are nearly unchanged. The above comparison shows that the MLR and ANN retrieval methods depend strongly on the brightness temperatures used in the retrieval, whereas these have very little impact on the retrieval accuracies of 1DVAR.
Figure 10. Comparison results of three retrieval methods. (a) Temperature, (b) Humidity.
6. Summary and discussion 


MWHTS water vapor channels are highly sensitive to clouds and rain, and can detect different layers of clouds and rain well owing to the new channels with frequency at 183.31 ± 4.5 GHz. We have derived a cloud filtering method that screens out the cloud- and precipitation-affected observations using the brightness temperature differences between the water vapor channels. Based on the sensitivities of MWHTS water vapor channels 14 and 15 to the cloud water content, and taking the RMSEs between observation and RTTOV simulation in clear sky as reference values, we develop the cloud filtering criteria $\Delta_{15-11} > 12.5$ and $\Delta_{14-11} > 8.1$. The proposed cloud filtering method improves the agreement between simulated and observed brightness temperatures. To evaluate the cloud filtering method in the retrieval process, the brightness temperatures before and after filtering are used to retrieve the atmospheric temperature and humidity profiles through different retrieval methods. The retrieval results show that the cloud filtering method effectively improves the retrieval accuracies of MLR and 1DVAR, which depend on the nonlinearity of the radiative transfer model, but has little effect on that of the ANN, because the ANN can represent the nonlinear relationship between the atmospheric parameters and satellite observations. The comparison of retrieval results using the three methods shows that the retrieval accuracies of MLR, which describes only a linear relationship between the atmospheric parameters and satellite observations, are the worst. The ANN and 1DVAR can describe the nonlinear relationship and have similar temperature retrieval accuracies, but the humidity retrieval accuracy of 1DVAR is lower than that of the ANN. From the viewpoint of operational application, the stability of the retrieval algorithms is tested, and the results show that 1DVAR is the best choice.

In this paper, we focus on reducing the differences between simulated and observed brightness temperatures. In general, these differences are caused by the following sources: the satellite instrument itself (e.g. poor calibration, or adverse environmental effects), the radiative transfer model linking the atmospheric parameters to the radiation measured by the sounder (e.g. errors in the spectroscopy, simplistic modeling of the viewing geometry of the sounder, or inaccurately modeled scattering effects), and errors in the atmospheric parameters input to the radiative transfer model. Filtering out the cloud- and precipitation-affected observations is a pre-processing step that is very important for the subsequent bias correction, which can further improve the retrieval accuracy.

It is worth noting that the proposed cloud filtering method does not take limb darkening into account, although MWHTS is a cross-track scanning instrument. However, from the viewpoint of the statistical computation in this study, the effects of limb darkening on the simulation accuracies may be partly filtered out by the cloud filtering method. The cloud filtering method may be optimized by deriving the cloud filtering criteria from statistical analysis for different viewing geometries of MWHTS. This will be the subject of our next investigation. In addition, we study MWHTS measurements over only a part of the ocean in this paper, so a cloud filtering method for MWHTS observations over the global range is also a direction for future research.